Distinguishing Microbial Genome Fragments Based on Their Composition: Evolutionary and Comparative Genomic Perspectives

Authors

  • Scott C. Perry
  • Robert G. Beiko
Abstract

It is well known that patterns of nucleotide composition vary within and among genomes, although the reasons why these variations exist are not completely understood. Between-genome compositional variation has been exploited to assign environmental shotgun sequences to their most likely originating genomes, whereas within-genome variation has been used to identify recently acquired genetic material such as pathogenicity islands. Recent sequence assignment techniques have achieved high levels of accuracy on artificial data sets, but the relative difficulty of distinguishing lineages with varying degrees of relatedness, and different types of genomic sequence, has not been examined in depth. We investigated the compositional differences in a set of 774 sequenced microbial genomes, finding rapid divergence among closely related genomes, but also convergence of compositional patterns among genomes with similar habitats. Support vector machines were then used to distinguish all pairs of genomes based on genome fragments 500 nucleotides in length. The nearly 300,000 accuracy scores obtained from these trials were used to construct general models of distinguishability versus taxonomic and compositional indices of genomic divergence. Unusual genome pairs were evident from their large residuals relative to the fitted model, and we identified several factors including genome reduction, putative lateral genetic transfer, and habitat convergence that influence the distinguishability of genomes. The positional, compositional, and functional context of a fragment within a genome has a strong influence on its likelihood of correct classification, but in a way that depends on the taxonomic and ecological similarity of the comparator genome.
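As a rough illustration of what "composition" of a 500-nucleotide fragment can mean in practice, the sketch below cuts a sequence into fixed-length fragments and turns each one into a vector of k-mer frequencies that a classifier such as a support vector machine could take as input. The abstract does not say which compositional features were used, so the choice of tetranucleotides (k=4), the Manhattan distance, and all function names here are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def split_into_fragments(sequence, size=500):
    """Cut a sequence into non-overlapping fragments of exactly `size` nt.

    A trailing piece shorter than `size` is discarded.
    """
    return [sequence[i:i + size]
            for i in range(0, len(sequence) - size + 1, size)]


def kmer_frequencies(fragment, k=4):
    """Return the relative frequency of every DNA k-mer in a fragment.

    Assumption: windows containing characters other than A, C, G, T
    (for example N) are ignored rather than imputed.
    """
    fragment = fragment.upper()
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(fragment[i:i + k]
                     for i in range(len(fragment) - k + 1))
    total = sum(counts[km] for km in all_kmers)
    if total == 0:
        return {km: 0.0 for km in all_kmers}
    return {km: counts[km] / total for km in all_kmers}


def manhattan_distance(p, q):
    """Sum of absolute frequency differences between two profiles."""
    return sum(abs(p[km] - q[km]) for km in p)


if __name__ == "__main__":
    # Toy sequences standing in for two genomes (not real data).
    genome_a = "ACGT" * 300
    genome_b = "AATT" * 300
    frag_a = split_into_fragments(genome_a)[0]
    frag_b = split_into_fragments(genome_b)[0]
    profile_a = kmer_frequencies(frag_a)
    profile_b = kmer_frequencies(frag_b)
    print(len(profile_a), manhattan_distance(profile_a, profile_b))
```

Each fragment becomes a fixed-length vector (256 values for k=4), so fragments from two genomes can be labelled and passed to any standard binary classifier; the distance function is only a simple stand-in for a compositional index of divergence.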

Similar resources

Comparative Phylogenetic Perspectives on the Evolutionary Relationships in the Brine Shrimp Artemia Leach, 1819 (Crustacea: Anostraca) Based on Secondary Structure of ITS1 Gene

This is the first study on phylogenetic relationships in the genus Artemia Leach, 1819 using the pattern and sequence of secondary structures of internal transcribed spacer 1 (ITS1). Significant intraspecific variation in the secondary structure of ITS1 rRNA was found in Artemia tibetiana. In the phylogenetic tree based on joined primary and secondary structure sequences, Artemia urmiana and pa...

Editorial: Applications of Genome Sequences for Discovering Characteristics that Are Unique to Different Groups of Organisms and Provide Insights into Evolutionary Relationships

Genome sequences are making available an unprecedented amount of genetic information that has the potential to reliably elucidate many aspects of physiology, biochemistry, and evolutionary relationships of different organisms. For efficient analyses of the vast amount of genomic sequence data, new reductive approaches are needed which can identify reliable genetic characteristics that are speci...

Genome composition and phylogeny of microbes predict their co-occurrence in the environment

The genomic information of microbes is a major determinant of their phenotypic properties, yet it is largely unknown to what extent ecological associations between different species can be explained by their genome composition. To bridge this gap, this study introduces two new genome-wide pairwise measures of microbe-microbe interaction. The first (genome content similarity index) quantifies si...

Inferring Horizontal Gene Transfer

Horizontal or Lateral Gene Transfer (HGT or LGT) is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as ...

2005: an EPA odyssey.

MobilomeFINDER (http://mml.sjtu.edu.cn/Mobilome FINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘pr...

Volume 2

Publication date: 2010